Targeted Gene Metagenomic Data Analysis ◾ 299
metrics. First, we need to normalize the read counts across samples to adjust for any bias
arising from the different sequencing depths and to make the comparison meaningful. The
normalization is performed by rarefying the counts in the feature table to a user-specified
sampling depth. The lowest per-sample read count, which is determined from a summary of the
feature table, can be chosen as this depth. That number is then provided to the
“--p-sampling-depth” parameter of the “diversity” plugin as the sampling depth for all
samples. When the plugin command is executed, reads are drawn from each sample without
replacement so that each sample in the resulting table has a total count equal to the
sampling depth. Then, the alpha and beta diversity metrics are computed. The following
script creates a summary of the feature table to determine the lowest read count:
qiime feature-table summarize \
--i-table dada2/table_feat_sample_freq_filtered_yoga_dada2.qza \
--o-visualization dada2/table_feat_sample_freq_filtered_yoga_dada2.qzv \
--m-sample-metadata-file data/sample-metadata.tsv
qiime tools view dada2/table_feat_sample_freq_filtered_yoga_dada2.qzv
When we study the summary, we can observe that the lowest read count among the samples
is 955 sequences, so we can set the --p-sampling-depth parameter to 955. This step will
subsample the counts in each sample without replacement so that each sample in the
resulting table has a total count of 955. Because 955 is the minimum, no sample falls
below the sampling depth and none is discarded.
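The subsampling step described above can be sketched in a few lines of Python. This is only an illustration of rarefying without replacement, not the code that the “diversity” plugin runs; the function name rarefy, the toy feature table, and the fixed random seed are assumptions made for the example.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample one sample's feature counts to `depth` reads without replacement.

    `counts` maps feature IDs to read counts. Returns None when the sample
    holds fewer than `depth` reads, since such a sample would be dropped.
    """
    # Expand the counts into one entry per read, labelled by its feature.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    if len(reads) < depth:
        return None
    # random.sample draws without replacement; the seed is fixed only so the
    # illustration is repeatable.
    picked = random.Random(seed).sample(reads, depth)
    rarefied = {}
    for feature in picked:
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied

# Toy feature table with made-up counts (hypothetical sample and feature IDs).
table = {
    "sample_A": {"ASV1": 6, "ASV2": 3, "ASV3": 1},
    "sample_B": {"ASV1": 2, "ASV2": 2, "ASV3": 4},
    "sample_C": {"ASV1": 10, "ASV2": 5, "ASV3": 5},
}

# Per-sample totals play the role of the feature table summary.
totals = {sample: sum(counts.values()) for sample, counts in table.items()}
depth = min(totals.values())  # the lowest read count becomes the sampling depth

rarefied_table = {sample: rarefy(counts, depth) for sample, counts in table.items()}
for sample, counts in rarefied_table.items():
    print(sample, sum(counts.values()), counts)
```

In this toy table, the shallowest sample keeps all of its reads, while the deeper samples are reduced to the same total. A depth above the lowest read count would instead remove the shallowest samples from the table.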
The “core-metrics-phylogenetic” pipeline of the “diversity” plugin requires a rooted
phylogenetic tree artifact, a feature table artifact, and the sample metadata file as inputs,
and it saves the alpha and beta diversity metrics into the specified output directory.
qiime diversity core-metrics-phylogenetic \
--i-phylogeny trees2/rooted-tree-yoga_dada2.qza \
--i-table dada2/table_feat_sample_freq_filtered_yoga_dada2.qza \
--p-sampling-depth 955 \
--m-metadata-file data/sample-metadata.tsv \
--output-dir diversity-indices
The metrics are saved to the output directory. We can use those metrics to explore
the microbial composition of the samples in the context of the groupings defined in the
sample metadata.
We will test for associations between categorical metadata columns and alpha diversity
data. We will do that here for Faith's Phylogenetic Diversity (a measure of community
richness) and Shannon diversity. The following commands will test for significant
differences in the alpha diversity measures between groups of samples:
qiime diversity alpha-group-significance \
--i-alpha-diversity diversity-indices/faith_pd_vector.qza \
--m-metadata-file data/sample-metadata.tsv \
--o-visualization diversity-indices/faith-pd-group-significance.qzv
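The group comparison reported by alpha-group-significance is a Kruskal-Wallis test, which works on the ranks of the per-sample alpha diversity values. The sketch below shows the idea on Shannon diversity, with made-up counts and hypothetical group labels. The function names are assumptions for this example, the logarithm base of 2 is an assumption, and the H statistic is computed without a tie correction, so it may differ slightly from the plugin's output when values are tied.

```python
import math

def shannon(counts, base=2):
    """Shannon diversity of one sample from its feature counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)

def average_ranks(values):
    """1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a dict of groups."""
    pooled = [value for values in groups.values() for value in values]
    ranks = average_ranks(pooled)
    n = len(pooled)
    weighted, start = 0.0, 0
    for values in groups.values():
        rank_sum = sum(ranks[start:start + len(values)])
        weighted += rank_sum ** 2 / len(values)
        start += len(values)
    return 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)

# Made-up rarefied counts for two hypothetical metadata groups.
samples = {
    "group_1": [[8, 1, 1], [7, 2, 1], [9, 1, 0]],
    "group_2": [[4, 3, 3], [3, 4, 3], [5, 3, 2]],
}
alpha = {g: [shannon(c) for c in sample_counts] for g, sample_counts in samples.items()}
h_statistic = kruskal_h(alpha)
print(alpha)
print("H =", h_statistic)
```

H is then compared with a chi-square distribution with (number of groups - 1) degrees of freedom to obtain a p-value. That step is left out here to keep the sketch free of external libraries.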